A central tension in our modeling is between explanation (building good causal models) and prediction. In his lecture, McElreath builds the intuition that the models that predict best are often ones that do a terrible job of representing the causal model. The tools covered in this lecture should therefore be treated as tools for prediction, not for identifying causal models.
When trying to maximize predictive accuracy, we need to be wary of overfitting, which happens when the model learns too much from the sample and so predicts new data poorly. Methods for avoiding overfitting favor simpler models. We’ll make use of regularization, which helps stop the model from becoming too excited about any one data point. We’ll also discuss scoring devices, like information criteria and cross-validation, which estimate how well a model will predict out of sample.
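To make overfitting concrete, here is a minimal sketch in Python with NumPy. It is not code from the lecture, and everything in it is an assumption chosen for illustration: the straight-line data-generating process, the sample sizes, the polynomial degrees, and the penalty value. We fit polynomials of increasing degree to a small training sample and compare in-sample error with error on fresh data. We then refit the most flexible model with a ridge penalty, which stands in here for the regularizing priors used in the lecture (a ridge penalty corresponds to a zero-centered Gaussian prior on the coefficients).

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n):
    # Hypothetical data-generating process: a straight line plus noise
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 0.5 + 1.5 * x + rng.normal(0.0, 0.5, size=n)
    return x, y

def design_matrix(x, degree):
    # Columns are x^0, x^1, ..., x^degree
    return np.vander(x, degree + 1, increasing=True)

def fit(x, y, degree, penalty=0.0):
    # Least squares with an optional ridge penalty on every coefficient
    # except the intercept, solved by augmenting the design matrix.
    X = design_matrix(x, degree)
    D = np.sqrt(penalty) * np.eye(degree + 1)
    D[0, 0] = 0.0  # leave the intercept unpenalized
    A = np.vstack([X, D])
    b = np.concatenate([y, np.zeros(degree + 1)])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

def mse(x, y, coef):
    # Mean squared prediction error for a fitted polynomial
    pred = design_matrix(x, len(coef) - 1) @ coef
    return float(np.mean((y - pred) ** 2))

x_train, y_train = simulate(12)    # small sample the model learns from
x_test, y_test = simulate(1000)    # fresh data from the same process

train_err, test_err = {}, {}
for d in range(1, 7):
    coef = fit(x_train, y_train, d)
    train_err[d] = mse(x_train, y_train, coef)
    test_err[d] = mse(x_test, y_test, coef)
    print(f"degree {d}: train MSE {train_err[d]:.3f}, test MSE {test_err[d]:.3f}")

# Same flexible model, with and without the (assumed) penalty of 1.0
coef_flat = fit(x_train, y_train, 6, penalty=0.0)
coef_ridge = fit(x_train, y_train, 6, penalty=1.0)
print("degree 6, no penalty, test MSE:", round(mse(x_test, y_test, coef_flat), 3))
print("degree 6, ridge,      test MSE:", round(mse(x_test, y_test, coef_ridge), 3))
```

In-sample error can only go down as the degree increases, because each polynomial nests the previous one. Out-of-sample error carries no such guarantee, and with a sample this small it will typically get worse for the most flexible fits. The penalty deliberately makes the in-sample fit a little worse by shrinking the coefficients toward zero, and that trade is what regularization offers in exchange for better expected prediction.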